Iterative Refinement of the Approximate Posterior for Directed Belief Networks
Variational methods that rely on a recognition network to approximate the posterior of directed graphical models offer better inference and learning than previous methods. Recent advances that exploit the capacity and flexibility in this approach have expanded what kinds of models can be trained. However, as a proposal for the posterior, the capacity of the recognition network is limited, which can constrain the representational power of the generative model and increase the variance of Monte Carlo estimates. To address these issues, we introduce an iterative refinement procedure for improving the approximate posterior of the recognition network and show that training with the refined posterior is competitive with state-of-the-art methods. The advantages of refinement are further evident in an increased effective sample size, which implies a lower variance of gradient estimates.
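The link the abstract draws between effective sample size and variance can be made concrete with a small sketch. The code below assumes the common Kish definition of effective sample size, (sum of weights) squared divided by the sum of squared weights, computed from unnormalized log importance weights; the abstract does not state which definition the authors use, and the function name `effective_sample_size` is hypothetical.

```python
import math


def effective_sample_size(log_weights):
    """Kish effective sample size from unnormalized log importance weights.

    Assumed definition (not confirmed by the source): ESS = (sum w)^2 / sum(w^2).
    The value ranges from 1 (one sample dominates) to len(log_weights)
    (all samples weighted equally).
    """
    if not log_weights:
        raise ValueError("need at least one log weight")
    # Subtract the maximum before exponentiating for numerical stability;
    # the ESS is unchanged by a common rescaling of the weights.
    shift = max(log_weights)
    weights = [math.exp(lw - shift) for lw in log_weights]
    total = sum(weights)
    return total * total / sum(w * w for w in weights)
```

Under this definition, a proposal that matches the posterior well yields nearly equal weights and an effective sample size close to the number of samples, while a poor proposal concentrates the weight on a few samples and drives the value toward 1, which is the situation in which Monte Carlo estimates become high-variance.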
Reviews: Iterative Refinement of the Approximate Posterior for Directed Belief Networks
The paper is clearly written and explains its technical concepts in an accessible way. The approach is sound and well motivated, and the experimental comparisons are fair in the methods they cover, though they could have included more datasets. My greatest concern is the execution time of the proposed approach, since this is a sequential Monte Carlo method that performs multiple refinement passes at each step of training. The authors report convergence curves against epochs but not against wall-clock time. The latter should be provided, because the main motivation of the paper is to speed up training for this class of generative methods.
Iterative Refinement of the Approximate Posterior for Directed Belief Networks
Hjelm, Devon, Salakhutdinov, Russ R., Cho, Kyunghyun, Jojic, Nebojsa, Calhoun, Vince, Chung, Junyoung